Prediction of subcellular localizations using amino acid composition and order.
Abstract
Subcellular localization is important for protein function. To predict subcellular localization, we developed a method, SortPred, that uses amino acid composition and order. Composition represents global features, e.g., the amino acid composition of the full or partial sequence, while order represents local features, e.g., the order of amino acids in the sequence. The former was modeled by neural networks and the latter by a hidden Markov model. The method predicts signal peptides (SP), mitochondrial targeting peptides (mTP), chloroplast transit peptides (cTP), and nuclear or cytosolic sequences (other). Compared with previous methods, it achieved slightly higher prediction accuracy: 86% for plant and 91% for non-plant sequences. We analyzed the trained neural networks and hidden Markov models and found that they represent the biological features of the sequences well.
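The "global" composition features described in the abstract can be illustrated with a minimal sketch. This is not the SortPred implementation: the function names `composition` and `global_features`, the 30-residue N-terminal window, and the choice to concatenate the two vectors are all assumptions made for illustration.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq, start=0, end=None):
    """Return the fraction of each standard amino acid in seq[start:end].

    Hypothetical helper: non-standard symbols (e.g. X) are ignored.
    """
    region = seq[start:end].upper()
    counts = Counter(region)
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]


def global_features(seq, n_term=30):
    """Concatenate full-sequence and N-terminal compositions (40 values).

    The window length n_term=30 is an assumed value, not taken from the paper.
    """
    return composition(seq) + composition(seq, 0, n_term)
```

A vector like the one returned by `global_features` could serve as the input to a neural network classifier, while the residue order itself would be handled separately by a sequence model such as a hidden Markov model.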
Similar resources
MultiLoc: prediction of protein subcellular localization using N-terminal targeting sequences, sequence motifs and amino acid composition
Motivation: Functional annotation of unknown proteins is a major goal in proteomics. A key annotation is the prediction of a protein's subcellular localization. Numerous prediction techniques have been developed, typically focusing on a single underlying biological aspect or predicting a subset of all possible localizations. An important step is taken towards emulating the protein sorting proces...
Using N-terminal targeting sequences, amino acid composition, and sequence motifs for predicting protein subcellular localizations
Functional annotation of unknown proteins is a major goal in proteomics. A key step in this annotation process is the definition of a protein’s subcellular localization. As a consequence, numerous prediction techniques for localization have been developed over the years. These methods typically focus on a single underlying biological aspect or predict a subset of all possible subcellular locali...
Improving Protein Localization Prediction Using Amino Acid Group Based Physichemical Encoding
Computational prediction of protein localization is one common way to characterize the functions of newly sequenced proteins. Sequence features such as amino acid (AA) composition have been widely used for subcellular localization prediction due to their simplicity while suffering from low coverage and low prediction accuracy. We present a physichemical encoding method that maps protein sequenc...
SubCellProt: Predicting Protein Subcellular Localization Using Machine Learning Approaches
High-throughput genome sequencing projects continue to churn out enormous amounts of raw sequence data. However, most of this raw sequence data is unannotated and, hence, not very useful. Among the various approaches to decipher the function of a protein, one is to determine its localization. Experimental approaches for proteome annotation including determination of a protein's subcellular loca...
A New Ensemble Scheme for Predicting Human Proteins Subcellular Locations
Predicting subcellular localizations of human proteins becomes crucial when new, unknown protein sequences do not have significant homology to proteins of known subcellular locations. In this paper, we present a novel approach to develop the CE-Hum-PLoc system. Individual classifiers are created by selecting a fixed learning algorithm from a pool of base learners and then trained by varying feature...
Journal:
- Genome informatics. International Conference on Genome Informatics
Volume 12
Publication date: 2001